Domain Adaptation to Summarize Human Conversations
نویسندگان
چکیده
We are interested in improving the summarization of conversations by using domain adaptation. Since very few email corpora have been annotated for summarization purposes, we attempt to leverage the labeled data available in the multiparty meetings domain for the summarization of email threads. In this paper, we compare several approaches to supervised domain adaptation using out-ofdomain labeled data, and also try to use unlabeled data in the target domain through semi-supervised domain adaptation. From the results of our experiments, we conclude that with some in-domain labeled data, training in-domain with no adaptation is most effective, but that when there is no labeled in-domain data, domain adaptation algorithms such as structural correspondence learning can improve summarization.
منابع مشابه
Question Detection in Spoken Conversations Using Textual Conversations
We investigate the use of textual Internet conversations for detecting questions in spoken conversations. We compare the text-trained model with models trained on manuallylabeled, domain-matched spoken utterances with and without prosodic features. Overall, the text-trained model achieves over 90% of the performance (measured in Area Under the Curve) of the domain-matched model including prosod...
متن کاملDeep Unsupervised Domain Adaptation for Image Classification via Low Rank Representation Learning
Domain adaptation is a powerful technique given a wide amount of labeled data from similar attributes in different domains. In real-world applications, there is a huge number of data but almost more of them are unlabeled. It is effective in image classification where it is expensive and time-consuming to obtain adequate label data. We propose a novel method named DALRRL, which consists of deep ...
متن کاملSample-oriented Domain Adaptation for Image Classification
Image processing is a method to perform some operations on an image, in order to get an enhanced image or to extract some useful information from it. The conventional image processing algorithms cannot perform well in scenarios where the training images (source domain) that are used to learn the model have a different distribution with test images (target domain). Also, many real world applicat...
متن کاملDomain Adaptation for Relation Extraction with Domain Adversarial Neural Network
Relations are expressed in many domains such as newswire, weblogs and phone conversations. Trained on a source domain, a relation extractor’s performance degrades when applied to target domains other than the source. A common yet labor-intensive method for domain adaptation is to construct a target-domainspecific labeled dataset for adapting the extractor. In response, we present an unsupervise...
متن کاملLarge scale speech-to-text translation with out-of-domain corpora using better context-based models and domain adaptation
In this paper, we described the process of building a large-scale speech-to-text pipeline. Two target domains, daily conversations and travel-related conversations between two agents, for the English-German language pair (both directions) are examined. The SMT component is built from out-of-domain but freely-available bilingual and monolingual data. We make use of most of the known available re...
متن کامل